Applied Text Analytics for Comments on News-Articles: A Bachelor Thesis
Author
Abstract
Several online daily newspapers offer readers the opportunity to comment directly on articles. In the Netherlands this feature is used quite often, and the quality of the comments, both grammatically and content-wise, is surprisingly high. The paper develops techniques to collect, store, enrich and analyze these comments. After giving a high-level overview of the Dutch ‘commentosphere’, we zoom in on extracting the discussion structure found in flat comment threads: people not only comment on the news article, they also comment heavily on other comments, so that threads resemble discussion fora. We show how techniques from information retrieval, natural language processing and machine learning can be used to extract the ‘reacts-on’ relation between comments with remarkably high precision and recall.
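To make the ‘reacts-on’ idea concrete, the following is a minimal sketch of how such a relation might be guessed in a flat thread. It is not the method of the thesis, which the abstract does not detail. It assumes a hypothetical input format of (author, text) pairs in posting order and combines two simple cues: an explicit mention of an earlier commenter's name, and bag-of-words cosine similarity to earlier comments. The function names, the threshold value and the sample thread are all illustrative assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def guess_reacts_on(comments, threshold=0.2):
    """Guess, for each comment, the index of the earlier comment it reacts on.

    comments: list of (author, text) tuples in posting order (assumed format).
    Returns a list whose entry i is an earlier index, or None when the
    comment is taken to react on the article itself.
    """
    bags = [Counter(tokenize(text)) for _, text in comments]
    links = []
    for i, (_, text) in enumerate(comments):
        tokens = set(tokenize(text))
        target = None
        # Cue 1: the most recent earlier author mentioned by name wins.
        for j in range(i - 1, -1, -1):
            if comments[j][0].lower() in tokens:
                target = j
                break
        # Cue 2: otherwise take the most similar earlier comment,
        # provided its similarity exceeds the (arbitrary) threshold.
        if target is None:
            best = threshold
            for j in range(i):
                score = cosine(bags[i], bags[j])
                if score > best:
                    best, target = score, j
        links.append(target)
    return links


# Invented sample thread, for illustration only.
thread = [
    ("Anna", "The new train schedule is a disaster for commuters."),
    ("Bert", "Ticket prices went up again this year."),
    ("Carl", "@Anna I agree, the schedule ignores commuters completely."),
    ("Dina", "Higher ticket prices every year, and for what?"),
]
print(guess_reacts_on(thread))
```

On this sample thread the third comment is linked to the first through the name mention, and the fourth to the second through shared vocabulary. A realistic system would presumably replace the fixed threshold with a trained classifier and evaluate the predicted links against hand-annotated threads to obtain precision and recall.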
Similar resources
Arabic News Articles Classification Using Vectorized-Cosine Based on Seed Documents
Besides its own merits, text classification (TC) has become a cornerstone in many applications. Work presented here is part of and a pre-requisite for a project we have undertaken to create a corpus for the Arabic text process. It is an attempt to create modules automatically that would help speed up the process of classification for any text categorization task. It also serves as a tool for...
Sentiment analysis methods in Persian text: A survey
With the explosive growth of social media such as Twitter, reviews on e-commerce website, and comments on news websites, individuals and organizations are increasingly using opinions in these media for their decision making. Sentiment analysis is one of the techniques used to analyze users' opinions in recent years. Persian language has specific features and thereby requires unique meth...
User Activity Analytics on the Social Web of News
The proliferation of social media is undoubtedly changing the way people produce and consume news online. Editors and publishers in newsrooms need to understand user engagement and audience sentiment evolution on various news topics. News consumers want to explore public reaction on articles relevant to a topic and refine their exploration via related entities, topics, articles and tweets. I wi...
Visual and Predictive Analytics on Singapore News: Experiments on GDELT, Wikipedia, and ^STI
The open-source Global Database of Events, Language, and Tone (GDELT) is the most comprehensive and updated Big Data source of important terms extracted from international news articles. We focus only on GDELT’s Singapore events to better understand the data quality of its news articles, accuracy of its term extraction, and potential for prediction. To test news completeness and validity, we v...
The Elephant Will Kick the Donkey: An Attitudinal Analysis of Online Comments on the GOP Letter to the Iranian Supreme Leader
With the growth in sociality and interaction around touching national and international topics, news sites are increasingly becoming places for communities to discuss and address issues spurred by news articles. Proponents of cyberspace promise that online discourse will increase political participation and pave the road for a democratic utopia (Papacharissi, 2004). Comment writers, according t...
Publication date: 2007